…language support Agent-Logs-Url: https://github.com/Himaan1998Y/pretext/sessions/ca665dce-f115-4eb3-87c1-8ca621c70083 Co-authored-by: Himaan1998Y <210527591+Himaan1998Y@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Implement divergence classifier and multi-language support
feat: Measurement Validator — Divergence Classifier & Multi-Language Support (Phase 1 + 2)
Apr 4, 2026
Adds a new `src/measurement-validator/` subsystem that compares Pretext canvas-based line measurements against DOM Range-API measurements, classifies root causes of divergences, and validates across 5 language groups.

Core modules

- `types.ts`: `MeasurementSample`, `MeasurementResult`, `DivergenceAnalysis`, `FixtureSample`, `TestSuiteReport`, tolerance config
- `dom-adapter.ts`: Range-API DOM adapter; extracts per-line text + widths without synthetic reflow
- `comparator.ts`: Runs `layoutWithLines()` against DOM metrics; assigns `pass` / `minor` / `major` / `critical` per line
- `report-generator.ts`: JSON + console formatters for single results and suite aggregates
- `classifier.ts`: Priority-ordered root-cause detector; both async (with DOM font-fallback check) and sync variants
- `test-suite.ts`: Multi-language runner with per-`LanguageGroup` stats aggregation
- `index.ts`: Public API surface

Classifier detection chain
Strategies fire in priority order; first match wins:

`rootCause`

- `font_fallback`: `serif`; compare totals
- `bidi_shaping`
- `emoji_rendering`: `\p{Emoji_Presentation}`
- `browser_quirk`: `system-ui` / variable font / Safari UA
- `unknown`

Test fixtures (46 samples)
- `english-samples.json`: LTR (EN/ES/FR/DE)
- `rtl-samples.json`: Arabic, Hebrew, Urdu
- `cjk-samples.json`: Chinese, Japanese, Korean (incl. `keep-all` mode)
- `complex-script-samples.json`: Thai, Myanmar, Khmer
- `mixed-bidi-samples.json`: Mixed RTL+LTR

Tests & docs

- `test/measurement-validator.test.ts`: comparator + report-generator unit tests (fake DOM adapter, no browser required)
- `test/classifier.test.ts`: per-strategy unit tests, priority ordering, output shape validation
- `docs/measurement-validator.md`: API reference
- `docs/classifier-guide.md`: per-cause examples and confidence interpretation
- `docs/language-matrix.md`: known divergences and browser compat per language group

Original prompt
Phase 2: Divergence Classifier & Multi-Language Support
OBJECTIVE
Implement intelligent root cause detection for measurement divergences and add support for multiple languages. This builds on Phase 1's foundation.
WHAT WE'RE BUILDING
Core Components
1. Divergence Classifier (`src/measurement-validator/classifier.ts`)
Identifies WHY measurements diverge between Pretext and DOM.
Detection Strategies:
Output:
2. Multi-Language Support
Expand from English-only to 6+ language groups:
3. Enhanced Test Suite (`src/measurement-validator/test-suite.ts`)
4. Multi-Language Fixtures
5. Enhanced Documentation
Directory Structure
FILES TO CREATE (Phase 2)
1. src/measurement-validator/classifier.ts
Root cause detection with 5 different strategies.
2. src/measurement-validator/test-suite.ts
Enhanced test runner with multi-language support.
3. test/classifier.test.ts
Unit tests for each detection strategy.
4. test/fixtures/rtl-samples.json
Arabic, Hebrew, Urdu test cases.
5. test/fixtures/cjk-samples.json
Chinese, Japanese, Korean test cases.
6. test/fixtures/complex-script-samples.json
Thai, Myanmar, Khmer test cases.
7. test/fixtures/mixed-bidi-samples.json
Mixed RTL/LTR test cases.
8. docs/classifier-guide.md
How to use and interpret classifier results.
9. docs/language-matrix.md
Language support and known issues per language.
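To illustrate what a multi-language runner such as item 2 might aggregate, here is a minimal sketch of per-group statistics. The severity names come from the PR summary; the `SampleOutcome` and `GroupStats` shapes, the group labels, and `aggregateByGroup` are hypothetical names chosen for this example and are not taken from the PR's `types.ts`.

```typescript
// Severity levels as named in the PR summary.
type Severity = "pass" | "minor" | "major" | "critical";

// Hypothetical shapes for illustration only; the real types may differ.
interface SampleOutcome {
  group: string; // e.g. "rtl", "cjk" (assumed labels)
  severity: Severity;
}

interface GroupStats {
  total: number;
  passed: number;
}

// Fold a flat list of per-sample outcomes into one stats record per group.
function aggregateByGroup(outcomes: SampleOutcome[]): Map<string, GroupStats> {
  const stats = new Map<string, GroupStats>();
  for (const outcome of outcomes) {
    const current = stats.get(outcome.group) ?? { total: 0, passed: 0 };
    current.total += 1;
    if (outcome.severity === "pass") {
      current.passed += 1;
    }
    stats.set(outcome.group, current);
  }
  return stats;
}
```

Keeping the aggregation as a pure function over outcomes means it can be unit-tested without a browser, in the same spirit as the fake DOM adapter mentioned in the PR summary.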
IMPLEMENTATION DETAILS
Classifier Algorithm
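As a rough sketch of the "priority order; first match wins" behaviour described in the PR summary, the synchronous variant could look something like the code below. The `rootCause` values are those listed in the summary; everything else (`DivergenceInput`, `Strategy`, `classifySync`, and the specific trigger checks) is an assumption made for illustration. The `font_fallback` strategy is left out because the summary ties it to a DOM check in the async variant, and the `browser_quirk` check here looks only at `system-ui`, not at variable fonts or the Safari UA.

```typescript
type RootCause =
  | "font_fallback"
  | "bidi_shaping"
  | "emoji_rendering"
  | "browser_quirk"
  | "unknown";

// Hypothetical input shape; the PR's real types live in types.ts.
interface DivergenceInput {
  text: string;
  fontFamily: string;
}

interface Strategy {
  rootCause: RootCause;
  matches(input: DivergenceInput): boolean;
}

// Built with the RegExp constructor so the Unicode property escape is parsed at runtime.
const EMOJI = new RegExp("\\p{Emoji_Presentation}", "u");

// Assumed stand-in for a bidi check: Hebrew and Arabic blocks only.
const RTL = /[\u0590-\u05FF\u0600-\u06FF]/;

// Array order is the priority order.
const strategies: Strategy[] = [
  { rootCause: "bidi_shaping", matches: (i) => RTL.test(i.text) },
  { rootCause: "emoji_rendering", matches: (i) => EMOJI.test(i.text) },
  { rootCause: "browser_quirk", matches: (i) => i.fontFamily.includes("system-ui") },
];

// Return the root cause of the first strategy that matches, else "unknown".
function classifySync(input: DivergenceInput): RootCause {
  for (const strategy of strategies) {
    if (strategy.matches(input)) {
      return strategy.rootCause;
    }
  }
  return "unknown";
}
```

With this layout, a string containing both Hebrew text and an emoji is reported as `bidi_shaping`, because that strategy sits earlier in the array. Reordering the array is the only change needed to adjust priorities.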